Listen to this short music snippet:
Do you find this snippet beautiful? Are you familiar with the song? Do you think this song fits with your overall genre preferences?
We are curious to know which factors influence the perception of beauty the most. After listening to the song, can you guess what those factors are? This question is at the heart of our research. Click through the website to check whether your predictions are accurate!
To compile suitable stimuli, a pre-test was conducted. After choosing 15 stimulus snippets, we created a Qualtrics survey. First, participants answered basic background questions, namely age, gender and country of origin. Afterwards, they completed the short version of the Gold-MSI test. Then, participants were presented with the 15 snippets in random order. For each snippet, they had to indicate whether they found it beautiful or not and whether they were familiar with it. Participants could listen to each snippet as many times as they liked. Finally, they had to choose their three favourite genres from a STOMP-based selection. All in all, we collected data from 119 participants.
Such a procedure helps to divide people into classes by their musical preferences, and then to check whether characteristics differ significantly between classes.
Characteristics of the Sample
In this section, we will analyse the graphs to determine the characteristics of our sample.
Before we begin the analysis, we would like to mention that our sample may be biased, as our participants are predominantly women in their 20s.
Click on the tab Nationality on the right to see the graph
As you can see from the graph, the largest group of our participants is from the Netherlands. Nevertheless, we are proud to mention that we gathered a diverse group of participants. The respondents come from 26 different countries all over the world: Asia, Africa, Europe and America. The majority of our participants are either European or Asian.
Click on the tab Genre Preferences on the right to see the graph
To assess genre preferences, we asked participants to choose their three favourite genres. The majority of our sample favours pop music. This might be due to the fact that our participants are predominantly in their 20s. Classical and rock are tied for second place. Bluegrass proved to be the least favourite genre. One plausible explanation is that bluegrass is less common overall than the other genres, and thus unfamiliar to the majority of participants.
Click on the tab STOMP on the right to see the graph
The Short Test of Music Preference (STOMP) is designed to assess music preferences that are related to personality variables, self-views and cognitive abilities. The test consists of 4 categories:
Reflective & Complex: classical, blues, folk, jazz, etc.
Intense & Rebellious: alternative, rock, heavy metal, punk, etc.
Upbeat & Conventional: country, religious, pop, soundtracks, etc.
Energetic & Rhythmic: electronica, rap, soul, funk, etc.
The genre preferences were divided according to these groups.
As can be seen, the Upbeat & Conventional category is the most liked. Genres from the Energetic & Rhythmic category are the least preferred by the participants.
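The grouping of genre preferences into STOMP categories can be sketched as follows. This is a minimal illustration, not our actual analysis code: the mapping contains only the genres named in the list above (the "etc." entries are left out), and the three example responses are hypothetical.

```python
from collections import Counter

# Genre-to-category mapping, restricted to the genres named in the text.
STOMP_CATEGORIES = {
    "Reflective & Complex": ["classical", "blues", "folk", "jazz"],
    "Intense & Rebellious": ["alternative", "rock", "heavy metal", "punk"],
    "Upbeat & Conventional": ["country", "religious", "pop", "soundtracks"],
    "Energetic & Rhythmic": ["electronica", "rap", "soul", "funk"],
}
GENRE_TO_CATEGORY = {
    genre: category
    for category, genres in STOMP_CATEGORIES.items()
    for genre in genres
}


def count_categories(favourites):
    """Count how often each STOMP category occurs among the favourite genres."""
    counts = Counter()
    for genres in favourites:
        for genre in genres:
            category = GENRE_TO_CATEGORY.get(genre.lower())
            if category is not None:  # genres not in the mapping are skipped
                counts[category] += 1
    return counts


# Hypothetical responses: three favourite genres per participant.
example_favourites = [
    ["pop", "rock", "classical"],
    ["pop", "jazz", "soundtracks"],
    ["rap", "pop", "country"],
]
category_counts = count_categories(example_favourites)
print(category_counts.most_common())
```

Genres that are not in the mapping are simply ignored here; a real analysis would need the full STOMP genre list.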
Click on the tab Musical Sophistication and Genre Preference on the right to see the graph
To test individual differences in musical sophistication, we used the short version of the Goldsmiths Musical Sophistication Index (Gold-MSI). The following 5 aspects are measured using a self-report questionnaire:
active musical engagement: the amount of time and resources spent on music;
self-reported perceptual abilities: the accuracy of musical listening skills;
musical training: the amount of formal musical training received;
self-reported singing abilities: the accuracy of singing;
sophisticated emotional engagement with music: the ability to talk about the emotions that music expresses.
According to the test, the higher the overall score (on a scale from 18 to 126), the more musically sophisticated the person is. To see if musical sophistication affects genre preferences, we looked at the distribution of the Gold-MSI scores in the STOMP groups.
From the graph, we can infer that all 4 categories have fairly similar median scores. Unsurprisingly, the highest Gold-MSI median is in the Reflective & Complex category. However, the differences between categories are not significant. Hence, we find no evidence that musical sophistication has much effect on genre preferences.
The majority of people would agree that perception of beauty, especially in music, is a highly subjective phenomenon. Or is it? In our research, we aim to explore the relationship between beauty assessment and musical sophistication.
Previous research on aesthetics and music mainly focusses on personality traits and how those influence perception. For instance, awe is one of the profound aesthetic experiences, often described as being touched, moved, fascinated and amazed. It was found that people who are more open to experience are more susceptible to awe-like states (Silvia et al., 2015). Interestingly, the study was conducted in 2 domains: both visual and auditory stimuli were used. Across both domains, openness to experience was the only factor predictive of a stronger experience of awe. One of the drawbacks of the methodology is that judgements were made by listening to only one song (‘Hoppípolla’ by Sigur Rós). Furthermore, although none of the participants understood the Icelandic language, the overall perceived ‘melodicity’, familiarity, etc. of the language could have affected the perception of the song (Jenkin, 2014). In the present study, we use only instrumental music to control for the language factor. Furthermore, 15 music snippets were chosen as stimuli.
Usually, listening to music is an aesthetic experience that requires activation of not only affective, but also cognitive and evaluative processes. Studies have found that music expertise modulates the cortical processing of different aspects of music perception (e.g. Atienza et al., 2002; Bosnyak et al., 2004). Cognitive researchers (Müller et al., 2010) compared aesthetic judgements between experts and laypersons by using event-related potential (ERP) measurements. They found that when exposed to the same stimuli, experts’ and laypersons’ ERP measures systematically differed. We believe that if there is a difference in aesthetic judgements between groups of experts and laypersons on the ‘brain’ level, then there should be an observable distinction on a more conscious level, too. As in Müller’s paper, we quantify beauty assessment by asking whether the piece is beautiful or not, rather than asking ‘do you like it’, as the majority of previous papers did (e.g. Brattico & Jacobsen, 2009). In this way, the question becomes more linguistically sound and precise.
Sophistication is not the only factor that could potentially influence beauty perception. Genre preference might also have a profound impact on whether the participant finds a piece beautiful or not (Istók et al., 2013). The modernist view of music aesthetics (Burke & Gridley, 1990) supports the idea of a genre hierarchy. This theory states that complex music, such as jazz, is less popularly valued because of its high intellectual demand. Followers of the theory would argue that jazz is a genre that can be comprehended and appreciated only by musically sophisticated individuals. For this reason, our stimuli include a variety of genres, for instance jazz (Drama In Six Notes), bluegrass (Less Is Moi) and electronic (syro u473t8+e [141.98]). In addition, we will check if certain genre preferences correspond with higher musical sophistication scores.
Inspired by the aforementioned literature and also by personal experiences, we would like to see if musical sophistication influences the perception of beauty. We hypothesize that higher scores for musical sophistication will align with higher scores for beauty. Furthermore, we will analyze the potential correlation between said scores and genre preference.
In the article “Are Musicians Particularly Sensitive to Beauty and Goodness” (Güsewell & Ruch, 2014), the degree and form of participants’ musical practice are compared to their responsiveness to artistic, natural and non-aesthetic beauty and goodness. This was examined using self-report and stimulus-based instruments.
It was found that professional musicians had the highest scores in responsiveness to artistic beauty, experience seeking, and absorption compared to the other groups. The amateur musicians scored highest on overall responsiveness, responsiveness to non-aesthetic goodness and responsiveness to nature. This supports the hypothesis that there is a link between sensitivity to beauty, goodness and musical practice.
From the data, researchers concluded that the responsiveness to beauty was related to the degree of involvement in musical practice. It was suggested that the opportunity to artistically express oneself was needed for a balanced responsiveness to the beauty profile. The groups that scored highest in responsiveness to beauty were believed to have more opportunities to express themselves (amateurs and soloists) or take part in musical activities where strong expressive and artistic involvement was needed (high-level orchestra musicians).
However, the participants were not grouped based on the opportunity for expression through music. Thus, based on the results of soloists and amateur musicians it is not yet possible to conclude that personal interpretation of music increases responsiveness to beauty. In our study, we focus on the musical sophistication of a larger sample, including participants with both low and high scores. Such a sample might provide a more in-depth look at the relationship between musical experience and perception of beauty.
Contact Information
This research project was conducted as part of the Honours course “The Data Science of Everyday Music Listening” at the University of Amsterdam (coordinated by dhr. dr. John Ashley Burgoyne).
Research team members: Xiaoqing Li, Esther Liefting, Willem Pleiter, Denise Quek, Nikita van ’t Rood, Kristina Savickaja
If you have any questions or comments, please contact us at music.beauty.aesthetics@gmail.com
To create the stimuli, we searched online for known datasets of musical pieces that had been dissected according to their musical components. Since the search yielded no results, each member of the group supplied, via Spotify, 5 instrumental songs that they found beautiful. The chosen songs had to be instrumental to control for the influence of language. Afterwards, the compiled songs were evaluated by 3 musical experts (10+ years of formal musical training). The experts rated 30-second song snippets on a 10-point Likert scale on the following criteria:
Tempo: the general pulse of the song, very slow (1) - very fast (10)
Articulation: the rhythmic articulation of the song, very staccato (1) - completely legato (10)
Mode: overall mode and feel of the song, minor (1) - major (10)
Intensity: overall loudness, crescendos and decrescendos in a song, pianissimo (1) - fortissimo (10)
Tonalness: overall tonalness of the composition, very atonal (1) - very tonal (10)
Pitch: overall distribution of the pitch, all bass (1) - all treble (10)
Melody: overall presence and dominance of melody, very unmelodious (1) - very melodious (10)
Rhythmic Clarity: overall presence of a pulse, very vague (1) - very firm (10)
Rhythmic Complexity: the extent to which different meters, odd tempo or complex rhythmic patterns are utilized, very simple (1) - very complex (10)
This method is based on the evaluation system used by Aljanaki et al. (2016). The final selection of 15 stimulus songs was made based on A) Feature Representability and B) Reliability.
A) Feature Representability
The panel on the right is interactive: hover over a point with your mouse to find out more.
The combined box and jitterplot shows the overall distribution of the characteristics of the songs. While the boxplot represents the feature values of all 30 songs, the jitterplot illustrates the feature values of the 15 chosen stimuli songs.
As can be seen from the jitterplot, our final selection covers quite a large range for most parameters.
B) Reliability
To finalize our selection of stimulus songs, we first estimated the reliability of the expert ratings per song. To do this, the distances between the experts’ ratings were computed. For example, each evaluator rated a song on Tempo. If the first rater gave it a 5, the second rater a 6 and the third a 7, the distance is calculated by taking the difference between the first and the second rater (6 - 5 = 1), between the second and the third rater (7 - 6 = 1) and between the first and the third rater (7 - 5 = 2). The sum of the differences (1 + 1 + 2 = 4) provides an estimate of the consensus for Tempo. Subsequently, this process was repeated for all components per song. Then, all reliability scores per component were summed to give an estimate of overall reliability. The table on the right shows these scores for all 30 songs.
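The calculation above can be sketched in a few lines of Python. This is an illustrative sketch rather than our original script: it assumes that each pairwise distance is taken as an absolute difference, and the ratings for the example song are hypothetical (only the Tempo ratings 5, 6 and 7 come from the text).

```python
from itertools import combinations


def consensus_distance(ratings):
    """Sum of absolute differences between every pair of expert ratings."""
    return sum(abs(a - b) for a, b in combinations(ratings, 2))


def song_reliability(ratings_by_feature):
    """Sum the per-feature distances; a lower score means stronger consensus."""
    return sum(consensus_distance(r) for r in ratings_by_feature.values())


# Tempo example from the text: raters gave 5, 6 and 7.
tempo_distance = consensus_distance([5, 6, 7])
print(tempo_distance)  # 1 + 1 + 2 = 4

# Hypothetical ratings for one song on three of the nine features.
example_song = {
    "Tempo": [5, 6, 7],
    "Mode": [3, 3, 8],
    "Melody": [9, 9, 9],
}
example_reliability = song_reliability(example_song)
print(example_reliability)  # 4 + 10 + 0 = 14
```

With all nine features included, the same sum yields the per-song reliability score.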
As can be seen from the table, the reliability scores range between 26 and 64, with a lower score representing stronger consensus. Based on these scores, we set a cut-off point of < 45 and selected the final stimuli.
However, upon examining our prior selection, it became apparent that the distribution of melody ratings was skewed in favour of very melodious songs. Furthermore, there was a lack of atonal songs. To tackle these issues, we decided to discard The Kiss (reliability score of 42) and add Drama In Six Notes (reliability score of 46) instead.
Blueming
Bygone Bumps
Cia Pat
Decision (Price of Love)
Elysium
Firth Of Fifth
Less Is Moi
Married Life
Resolver
Scarface Theme
Single Petal Of A Rose
Song For A New Beginning
syro u473t8+e
Šešių Natų Drama (Drama In Six Notes)
USA III Rail
Latent Class Analysis (LCA) is a psychometric method in which participants are grouped based on how likely they are to respond positively to a certain survey item. In our case, the item is whether or not the participant finds a song snippet beautiful.
After running the data from 119 participants through the LCA, only a 3-class model fitted the data appropriately (refer to the ‘LCA Class Table’). The table shows, for each class, the probability that a person belonging to that class judges a specific snippet beautiful (yes/no). For instance, for USA III Rail, a person belonging to the ‘Likers’ group has a 72% chance of rating the song beautiful, compared to 49% for ‘Indifferents’ and 0% for ‘Dislikers’. These conditional probabilities are available for all classes, so you can hover over the table to see which songs were generally very liked and which were disliked.
As you may have figured out, the names of the classes correspond with the number of songs that their members find beautiful. Thus, ‘Likers’ find the majority of snippets beautiful, ‘Indifferents’ are the middle group and ‘Dislikers’ find most of the songs not beautiful. ‘Likers’ comprised 38% of our sample, ‘Indifferents’ 48% and ‘Dislikers’ 14%.
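To show how class sizes and conditional probabilities work together, the sketch below applies Bayes' rule to a single snippet. It uses the class proportions and the USA III Rail probabilities quoted above. It is a simplification, since the actual LCA combines the judgements on all 15 snippets, and the function name is ours rather than part of any LCA package.

```python
# Class sizes and P(beautiful | class) for USA III Rail, as quoted in the text.
class_sizes = {"Likers": 0.38, "Indifferents": 0.48, "Dislikers": 0.14}
p_beautiful = {"Likers": 0.72, "Indifferents": 0.49, "Dislikers": 0.00}


def class_posterior(found_beautiful):
    """P(class | judgement on this one snippet), computed with Bayes' rule."""
    joint = {}
    for name, size in class_sizes.items():
        p_yes = p_beautiful[name]
        likelihood = p_yes if found_beautiful else 1.0 - p_yes
        joint[name] = size * likelihood
    total = sum(joint.values())
    return {name: value / total for name, value in joint.items()}


posterior_yes = class_posterior(found_beautiful=True)
posterior_no = class_posterior(found_beautiful=False)

for name in class_sizes:
    print(f"{name}: yes={posterior_yes[name]:.2f}, no={posterior_no[name]:.2f}")
```

Under these numbers, a participant who finds USA III Rail beautiful cannot be a ‘Disliker’, because that class has a 0% chance of a "yes" for this snippet.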
ANOVA
The second tab shows the ANOVA and the post-hoc test.
In our analysis, ANOVA is a method in which the means and variances of the classes are compared to see whether the differences in Gold-MSI scores are significant. To find out which particular classes differ from each other, a Tukey’s test was performed. The test compares the groups pairwise, instead of all at once like ANOVA does.
On the x-axis, the size of the difference between each pair of groups (listed on the y-axis) can be seen. The intervals shown are 95% confidence intervals: intervals constructed so that, in 95% of repeated experiments, they contain the “true” difference in Gold-MSI score.
To illustrate the Tukey’s test further: the red intervals do not contain the value 0. If we were to repeat the experiment many times, 95% of the intervals constructed in this way would contain the “true” difference, so a true difference of 0 is implausible for these pairs. That allows us to reject the idea that these groups do not differ in Gold-MSI scores.
In conclusion, the plot shows evidence that the ‘Likers’ tend to have higher musical sophistication scores than both ‘Dislikers’ and ‘Indifferents’. There is no significant difference between ‘Dislikers’ and ‘Indifferents’. These results support our hypothesis that a higher Gold-MSI score corresponds with a higher number of “yes, I find this song beautiful” answers.
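For readers who want to see what the ANOVA compares, here is a minimal sketch of the one-way F statistic in plain Python. The Gold-MSI scores below are hypothetical and chosen only to keep the arithmetic easy to follow; they are not our data. The sketch stops at the F statistic: the p-value and the Tukey intervals require the F and studentized range distributions, which are not reproduced here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical Gold-MSI scores per latent class (not our data).
likers = [90, 95, 100]
indifferents = [70, 75, 80]
dislikers = [68, 73, 78]

f_stat, df_between, df_within = one_way_anova_f([likers, indifferents, dislikers])
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F value means that the class means differ more than the spread within the classes would lead us to expect.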
Here we will interpret the graphs
Interpretation will follow, also add why we didn’t include other graphs (Nikita will know)
Here we will visualize some Gender, STOMP-scores, Gold-MSI and other types of class characteristics that are interesting in a tabset form